31 research outputs found

Technical and Fundamental Features Analysis for Stock Market Prediction with Data Mining Methods

Author: Barak Sasan
Publication venue: Vysoká škola báňská - Technická univerzita Ostrava
Publication date: 01/01/2019

Predicting stock prices is an essential objective in the financial world, and forecasting stock returns and their risk is one of the most critical concerns of market decision makers. This thesis investigates stock price forecasting with three data mining approaches and shows how different elements of the stock price can help improve prediction accuracy. The first and second approaches take many fundamental indicators of the stocks as explanatory variables for stock price classification and forecasting. The third approach extracts technical features from the candlestick representation of share prices and uses them to improve forecasting accuracy. In each approach, different data mining and machine learning tools and techniques are employed to justify why the forecasting works. Because the aim is to evaluate the potential of features in stock trend forecasting, the experiments cover both technical and fundamental features. In the first approach, a three-stage methodology is developed. In the first stage, a comprehensive investigation identifies all features that may affect stock risk and return. In the next stage, risk and return are predicted by applying data mining techniques to these features. Finally, a hybrid algorithm based on filters and function-based clustering is developed, and the risk and return of stocks are re-predicted. In the second approach, instead of single classifiers, a fusion model is proposed that combines multiple diverse base classifiers operating on a common input with a meta-classifier that learns from the base classifiers' outputs to obtain more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting, and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number of base classifiers and the procedure for selecting them for the fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers' accuracy. Finally, in the third approach, a novel forecasting model for stock markets is presented, based on the wrapper ANFIS (Adaptive Neural Fuzzy Inference System) – ICA (Imperialist Competitive Algorithm) and technical analysis of Japanese candlesticks. Two approaches, Raw-based and Signal-based, are devised to extract the model's input variables, and buy and sell signals are the output variables. To illustrate the methodologies, Tehran Stock Exchange (TSE) data for the period from 2002 to 2012 are used for the first and second approaches, while General Motors and Dow Jones indexes are used for the third approach.

154 - Katedra financí (Department of Finance)

DSpace at VSB Technical University of Ostrava

Meta-learning for Forecasting Model Selection

Author: Barak Sasan
Publication venue: Lancaster University
Publication date: 01/01/2021

Model selection for time series forecasting is a challenging task for practitioners and academia. There are multiple approaches to address this, ranging from time series analysis using a series of statistical tests, to information criteria or empirical approaches that rely on cross-validated errors. In recent forecasting competitions, meta-learning obtained promising results, establishing its place as a model selection alternative. Meta-learning constructs meta-features for each time series and trains a classifier on these to choose the most appropriate forecasting method. In the first part, this thesis studies the main components of meta-learning and analyses the effect of alternative meta-features, meta-learners, and base forecasters on the final model selection results. We investigate different meta-learners, the use of simple or complex base forecasts, and a large and diverse set of meta-features. Our findings show that stationarity tests, which identify the presence of a unit root in a time series, and proxies of autoregressive information, which show the strength of serial correlation in a series, have the highest importance for the performance of meta-learning. In contrast, features related to time series quantiles and other descriptive statistics, such as the mean and the variance, exhibit the lowest importance. Furthermore, we observe that meta-learning with simple base forecasters is more sensitive to the number of feature groups employed as meta-features and performed worse overall. In terms of the choice of learners, classifiers with evidence of good performance in the literature resulted in the most accurate meta-learners. The success of meta-learning largely depends on its building components. The selection and generation of the appropriate meta-features remains a major challenge in meta-learning. In the second part, we propose using Convolutional Neural Networks (CNN) to overcome this. 
CNNs have demonstrated breakthrough accuracy in pattern recognition tasks and can generate features as needed internally, within their layers, without intervention from the modeller. Using CNNs, we provide empirical evidence of the efficacy of the approach against widely accepted forecast selection methods and discuss the advantages and limitations of the proposed approach. Finally, we provide additional evidence that meta-learning for automated model selection outperformed all of the individual benchmark forecasts.
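
The meta-learning loop described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's implementation: the single meta-feature (lag-1 autocorrelation), the two base forecasters, the 1-nearest-neighbour meta-learner, and all series values are assumptions chosen to keep the example small.

```python
from statistics import mean

def lag1_autocorr(series):
    """Lag-1 autocorrelation: a simple proxy for the strength of serial correlation."""
    m = mean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    den = sum((a - m) ** 2 for a in series)
    return num / den if den else 0.0

def naive_forecast(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon

def mean_forecast(train, horizon):
    """Repeat the historical mean."""
    return [mean(train)] * horizon

# Hypothetical pool of base forecasters (an assumption for this sketch).
BASE_FORECASTERS = {"naive": naive_forecast, "mean": mean_forecast}

def best_forecaster(series, horizon=4):
    """Label a series with the base forecaster that has the lowest holdout MAE."""
    train, test = series[:-horizon], series[-horizon:]
    errors = {
        name: mean(abs(f - y) for f, y in zip(forecast(train, horizon), test))
        for name, forecast in BASE_FORECASTERS.items()
    }
    return min(errors, key=errors.get)

def fit_meta_learner(training_series):
    """Store (meta-feature, best forecaster) pairs for a 1-nearest-neighbour lookup."""
    return [(lag1_autocorr(s), best_forecaster(s)) for s in training_series]

def select_forecaster(meta_learner, series):
    """Pick the forecaster of the training series with the closest meta-feature."""
    feature = lag1_autocorr(series)
    return min(meta_learner, key=lambda pair: abs(pair[0] - feature))[1]

# Made-up training series: one trending, one fluctuating around a level.
trend = [float(t) for t in range(20)]
noisy = [10, 13, 9, 12, 11, 8, 12, 10, 13, 9, 11, 12, 8, 10, 13, 9, 12, 10, 11, 9]
meta_learner = fit_meta_learner([trend, noisy])

# Unseen series are routed to a forecaster using only their meta-feature.
new_trend = [2.0 * t for t in range(30)]
new_noisy = [5, 8, 4, 7, 6, 3, 7, 5, 8, 4, 6, 7]
print(select_forecaster(meta_learner, new_trend))
print(select_forecaster(meta_learner, new_noisy))
```

A realistic setup would use many meta-features and a stronger classifier; the point here is only the division of labour between base forecasters, meta-features, and the meta-learner.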

Lancaster E-Prints

Fusion of multiple diverse predictors in stock market

Author: Arjmand Azadeh
Barak Sasan
Ortobelli Sergio
Publication date: 01/01/2017

Forecasting stock returns and their risk represents one of the most important concerns of market decision makers. Although many studies have examined single classifiers for stock return and risk prediction, fusion methods, which have only recently emerged, require further study in this area. The main aim of this paper is to propose a fusion model based on the use of multiple diverse base classifiers that operate on a common input and a meta-classifier that learns from the base classifiers' outputs to obtain more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number of base classifiers and the procedure for selecting them for the fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers' accuracy. The results demonstrate that Bagging exhibited superior performance within the fusion scheme and could achieve a maximum of 83.6% accuracy with Decision Tree, LAD Tree and Rep Tree for return prediction and 88.2% accuracy with BF Tree, DTNB and LAD Tree in risk prediction. For the feature selection part, a wrapper-GA algorithm is developed and compared with the fusion model. This paper seeks to help researchers select the best individual classifiers and the proper fusion scheme in stock market prediction. To illustrate the approach, we apply it to Tehran Stock Exchange (TSE) data for the period from 2002 to 2012.
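
The fusion idea in this abstract (diverse base classifiers on a common input, plus a meta-classifier trained on their outputs) can be illustrated with a small standard-library sketch. Everything here is an assumption for illustration: the base learners are decision stumps rather than the paper's tree classifiers, diversity comes from bootstrap resampling (bagging) only, the meta-classifier is a simple lookup table, and the data are synthetic.

```python
import random
from collections import Counter, defaultdict

def fit_stump(X, y):
    """Best single-feature threshold rule (a one-level decision tree)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[j] - t) > 0 else 0 for row in X]
                acc = sum(p == label for p, label in zip(preds, y))
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0

def fit_bagged_stumps(X, y, n_models=5, seed=0):
    """Bagging: train each base classifier on a bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def fit_meta_classifier(models, X_meta, y_meta):
    """Meta-classifier: map each tuple of base outputs to its most common label."""
    table = defaultdict(Counter)
    for row, label in zip(X_meta, y_meta):
        table[tuple(m(row) for m in models)][label] += 1
    default = Counter(y_meta).most_common(1)[0][0]

    def predict(row):
        votes = table.get(tuple(m(row) for m in models))
        return votes.most_common(1)[0][0] if votes else default

    return predict

# Synthetic data (assumption): the label depends only on the first feature.
rng = random.Random(42)
data = [(rng.random(), rng.random()) for _ in range(180)]
labels = [1 if x0 > 0.5 else 0 for x0, _ in data]
X_base, y_base = data[:60], labels[:60]      # trains the base classifiers
X_meta, y_meta = data[60:120], labels[60:120]  # trains the meta-classifier
X_test, y_test = data[120:], labels[120:]

bases = fit_bagged_stumps(X_base, y_base)
fusion = fit_meta_classifier(bases, X_meta, y_meta)
accuracy = sum(fusion(row) == label for row, label in zip(X_test, y_test)) / len(y_test)
print(f"fusion accuracy on held-out data: {accuracy:.2f}")
```

Training the meta-classifier on data the base classifiers did not see is a common design choice that keeps the meta-level from simply memorising base-level overfitting.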

Southampton (e-Prints Soton)

DSpace at VSB Technical University of Ostrava

Lancaster E-Prints

Fuzzy turnover rate chance constraints portfolio model

Author: Barak Sasan
Publication venue: Elsevier BV
Publication date: 31/01/2013

One concern of many investors is to own assets that can be liquidated easily. Thus, in this paper, we incorporate portfolio liquidity in our proposed model. Liquidity is measured by an index called turnover rate. Since the return of an asset is uncertain, we represent it as a trapezoidal fuzzy number, and its turnover rate is measured by fuzzy credibility theory. The desired portfolio turnover rate is controlled through a fuzzy chance constraint. Furthermore, to manage portfolios with asymmetric investment returns, in addition to mean and variance we also utilize the third central moment, the skewness of portfolio return. In fact, we propose a fuzzy portfolio mean–variance–skewness model with a cardinality constraint, which combines asset limitations with the liquidity requirement. To solve the model, we also develop a hybrid algorithm, called FCTPM, which combines the cardinality constraint, a genetic algorithm, and fuzzy simulation.
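
To make the fuzzy chance constraint concrete, the sketch below evaluates the standard credibility distribution of a trapezoidal fuzzy number and checks a constraint of the form Cr{turnover >= target} >= alpha. It covers a single fuzzy quantity only, and the function names, the turnover figures (in percent), and the confidence level are hypothetical; the paper's portfolio-level model and FCTPM algorithm are not reproduced.

```python
def credibility_leq(x, a, b, c, d):
    """Cr{xi <= x} for a trapezoidal fuzzy number xi = (a, b, c, d), a < b <= c < d."""
    if x <= a:
        return 0.0
    if x <= b:
        return (x - a) / (2 * (b - a))
    if x <= c:
        return 0.5
    if x <= d:
        return (x + d - 2 * c) / (2 * (d - c))
    return 1.0

def credibility_geq(x, a, b, c, d):
    """Cr{xi >= x}; equals 1 - Cr{xi <= x} because the membership is continuous."""
    return 1.0 - credibility_leq(x, a, b, c, d)

def expected_value(a, b, c, d):
    """Credibilistic expected value of a trapezoidal fuzzy number."""
    return (a + b + c + d) / 4

def satisfies_chance_constraint(target, alpha, a, b, c, d):
    """Check the chance constraint Cr{xi >= target} >= alpha."""
    return credibility_geq(target, a, b, c, d) >= alpha

# Hypothetical fuzzy turnover rate of one asset, in percent.
turnover = (1, 2, 4, 5)
print(credibility_geq(1.5, *turnover))                    # credibility of at least 1.5%
print(satisfies_chance_constraint(1.5, 0.7, *turnover))   # constraint at alpha = 0.7
print(expected_value(*turnover))
```

Raising the target or the confidence level alpha tightens the constraint, which is how such a model trades expected return against liquidity.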

Lancaster E-Prints

A genetic algorithm based grey goal programming (G3) approach for parts supplier evaluation and selection

Author: Barak Sasan
Publication venue: Informa UK Limited
Publication date: 17/10/2011

The problem of part supplier selection is a major concern for all manufacturers seeking to enhance product quality and productivity. The objective of this paper is to propose an integrated genetic algorithm based grey goal programming (G3) approach to solve the part supplier selection problem. The main factor in part supplier selection is the assembly relation of the parts, which is used to find a suitable combination of suppliers for the parts of a product. We first identify the main factors affecting supplier selection. We then present a grey-based goal programming model that serves as the fitness function to evaluate the suppliers with respect to the total deviation of the factors from their ideal values. Since the objective is to find the best solution, a genetic algorithm is used to solve this problem for faster and better evaluation. The novelty of this integrated approach is that it applies both qualitative and quantitative factors at once in one model and uses grey theory to cover the lack of information on qualitative factors, in order to find a solution close to the real situation.

Southampton (e-Prints Soton)

Lancaster E-Prints

Dependency evaluation of financial market returns for classifying and grouping stocks

Author: Barak Sasan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Following the globalization of the economy and the increasing significance of international trade investments, linkages among economic variables of different countries are becoming strikingly evident. There is a strong interest among researchers in capturing the presence and extent of such negative or positive correlations. In this paper, we introduce a novel methodology to identify correlated markets with a modified clustering procedure that finds the optimal number of countries within the clusters. The proposed methodology mainly works with the k-means clustering method, whose performance is improved by the particle swarm optimization (PSO) algorithm. The integration of these methods aims at finding the best number of clusters (k) within the dataset with a distance-based index, so that each cluster is assigned the most appropriate stock markets. As a case study, an experiment on daily and monthly stock market returns of 50 countries has been evaluated.
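
Choosing the number of clusters k with a distance-based index can be sketched as follows. This is only an illustration under stated assumptions: plain Lloyd's k-means with a deterministic farthest-first start stands in for the paper's PSO-improved k-means, the mean silhouette score stands in for the unspecified distance-based index, and the twelve two-dimensional "market" points are made up.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def silhouette(points, labels, k):
    """Mean silhouette score: a distance-based index of cluster quality."""
    clusters = [[p for p, lab in zip(points, labels) if lab == c] for c in range(k)]
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # Mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to any other non-empty cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for c, other in enumerate(clusters)
            if c != lab and other
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical 2-D return features for 12 markets forming three groups.
offsets = [(0.5, 0.5), (-0.5, 0.5), (0.5, -0.5), (-0.5, -0.5)]
markets = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (5, 9)] for dx, dy in offsets]

# Pick the k with the highest index value among the candidates.
best_k = max(range(2, 5), key=lambda k: silhouette(markets, kmeans(markets, k), k))
print(best_k)
```

A swarm-based search would replace the fixed initialisation and the brute-force loop over k, but the index-driven selection step would look the same.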

Southampton (e-Prints Soton)

Transfer-entropy-based dynamic feature selection for evaluating bitcoin price drivers

Author: Barak Sasan

Despite the growing literature on cryptocurrency forecasting and price drivers, the relationship between cryptocurrency prices and other financial time series is an ongoing matter of debate. This study proposes a three-step methodology to address this debate. First, we conduct an ad-hoc analysis using transfer entropy (TE) to study the causal relationship between Bitcoin (BTC) returns and a vast array of financial time series. Then, we utilise variables with a significant amount of information flow towards BTC returns to forecast multistep-ahead BTC returns. Finally, we use explainable artificial intelligence post-hoc analysis methods to discover the contribution of each input feature to the overall forecast. The results indicate a significant change in the information flow pattern in the first days of the COVID-19 pandemic outbreak. Additionally, our proposed TE-based feature selection method outperforms both benchmarks: a model without feature selection and backward stepwise regression.
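
Transfer entropy measures how much the past of one series reduces uncertainty about the next value of another, beyond what the target's own past already explains. The sketch below estimates it from empirical frequencies for discrete series with a history length of one. The binary up/down encoding, the history length, and the synthetic series are assumptions for illustration; the study's actual estimator and data are not reproduced.

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target):
    """TE(source -> target) in bits for discrete series, history length 1."""
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))  # (y_next, y, x)
    pairs_yx = Counter(zip(target[:-1], source[:-1]))             # (y, x)
    pairs_yy = Counter(zip(target[1:], target[:-1]))              # (y_next, y)
    singles = Counter(target[:-1])                                # y
    n = len(target) - 1
    te = 0.0
    for (y_next, y, x), count in triples.items():
        p_joint = count / n                                # p(y_next, y, x)
        p_given_both = count / pairs_yx[(y, x)]            # p(y_next | y, x)
        p_given_self = pairs_yy[(y_next, y)] / singles[y]  # p(y_next | y)
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te

# Synthetic example (assumption): 1 = up move, 0 = down move.
rng = random.Random(0)
driver = [rng.randint(0, 1) for _ in range(2000)]
follower = [0] + driver[:-1]  # the follower copies the driver with a one-step lag

te_driver_to_follower = transfer_entropy(driver, follower)
te_follower_to_driver = transfer_entropy(follower, driver)
print(te_driver_to_follower, te_follower_to_driver)
```

In a feature-selection setting, candidate series whose estimated information flow towards the target is significantly above zero would be kept as inputs; testing that significance (for example by shuffling the source series) is omitted here.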

Southampton (e-Prints Soton)

Outsourcing modelling using a novel interval-valued fuzzy quantitative strategic planning matrix (QSPM) and multiple criteria decision-making (MCDMs)

Author: Barak Sasan
Javanmard Shima
Publication venue: Elsevier BV
Publication date: 01/01/2020

Outsourcing drives companies to focus on their capabilities, the advantages of external resources, and decreasing overall operational costs. Selecting appropriate alliances, aligned with the company's strategies, creates a situation in which firms can enhance their technical capabilities and acquire new technologies. However, two critical issues in outsourcing modeling should be addressed: how to find strategic indicators for building successful alliances, and how to select these partners. In addition, because the imprecise and vague information (due to a lack of data) in outsourcing models cannot be neglected, fuzzy interval sets could efficiently address the complexity of these problems. To deal with these issues, this paper proposes a two-step interval-based framework. In the first step, and for the first time, an interval-valued fuzzy (IVF) version of the strengths-weaknesses-opportunities-threats (SWOT) technique is integrated with the quantitative strategic planning matrix (QSPM) and Gap analysis to find the most effective strategies for alliance evaluation and to weight them. In the next step, four interval-valued versions of multiple criteria decision-making methods (IVF-MCDMs) are implemented to evaluate the strategic partners. Finally, the results are aggregated with the help of the utility interval approach, and a sensitivity analysis is implemented to assess the robustness of the proposed methodology. To illustrate the efficiency of the proposed approach, a real partner selection problem at a holding car manufacturing factory in Iran is presented.

Southampton (e-Prints Soton)

DSpace at VSB Technical University of Ostrava

Developing an approach to evaluate stocks by forecasting effective features with data mining methods

Author: Barak Sasan
Modarres Mohammad
Publication venue: Elsevier BV
Publication date: 23/09/2014

In this research, a novel approach is developed to predict stock return and risk. In the first stage of this three-stage method, a comprehensive investigation identifies all features that may affect stock risk and return. In the next stage, risk and return are predicted by applying data mining techniques to these features. Finally, we develop a hybrid algorithm based on filter and function-based clustering; the important features for risk and return prediction are selected, and risk and return are then re-predicted. The results show that the proposed hybrid model is a proper tool for effective feature selection and that these features are good indicators for the prediction of risk and return. To illustrate the approach, as well as to train and test the models, we apply it to Tehran Stock Exchange (TSE) data from 2002 to 2011.

Southampton (e-Prints Soton)

Lancaster E-Prints